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Abstract —We propose a novel algorithm for testing the 
hypothesis of nonstationarity in complex-valued signals. The 
implementation uses both the bootstrap and the Fast Fourier 
Transform such that the algorithm can be efficiently implemented 
in 0{N\ogN) time, where N is the length of the observed signal. 
The test procedure examines the second-order structure and 
contrasts the observed power variance—i.e. the variability of the 
instantaneous variance over time—with the expected characteris¬ 
tics of stationary signals generated via the bootstrap method. Our 
algorithmic procedure is capable of learning different types of 
nonstationarity, such as jumps or strong sinusoidal components. 
We illustrate the utility of our test and algorithm through 
application to turbulent flow data from fluid dynamics. 
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I. Introduction 

In this paper we introduce a new automated method of 
detecting nonstationarity in observed time signals. Identifying 
and dealing with nonstationarity is a problem of fundamental 
importance in machine learning 0-0 . Nonstationarity can be 
hard to detect, because while there is only one way for a signal 
to be stationary, there are many ways for this assumption to be 
violated Q. Our approach is based on examining the variance 
of the observed signal. As we assume the signal is zero mean, 
or that the mean is removed, this is the simplest way that the 
signal may depart from stationarity. 

We focus our attention on bivariate signals, that are an 
important class of observations. Such data are commonly 
represented as complex-valued signals, as discussed in Q, and 
as performed in numerous applications including fMRI |6|, 
blood flow fT) , neural networks ||8| , oceanography |9|, 
seismology |10) , and meteorology inr The complex-valued 
representation is particularly useful for modelling trajectories 
of fluid particles in oceaongraphy and related applications p^ , 
also referred to by the name “Lagrangian data.” In Eig. 1 we 
display three trajectories of particles propagated in a simulation 
of forced-dissipative two-dimensional fluid turbulence. These 
trajectories exhibit various degrees and types of nonstationar¬ 
ity, and we will test our procedures on this data in this paper. 


To develop our test procedure we draw inspiration from 
existing methods to detect nonstationarity in time signals. Here 
tests have been developed for real-valued data, focussing on 
different structural behaviour of the observed signal. Standard 
procedures are formulated in the frequency domain GD-GD 
or the wavelet domain GD’Gl)- We propose to develop theory 
from the frequency domain, however, to capture nonlinear 
characteristics of the observed time signal, we develop a test 
in the time domain. Specifically our test procedure examines 
the power variance of the observed signal, i.e. the variability 
of the instantaneous variance over time. 

This is, to our knowledge, the first test for nonstation¬ 
arity designed specifically for complex-valued signals. We 
will demonstrate that testing the power variance can both 
identify heteroscedastic signals, where the variance of the 
signal changes over time, such as will seen shortly in Eig. 1(b), 
and signals with strong sinusoidal components that appear 
‘phase-locked’ as in Eig. 1(c). The latter is a strongly nonlinear 
feature and is associated with relatively low power variance, 
as we shall demonstrate. We will also derive the expectation of 
the test statistic under the null hypothesis, and show how the 
test can be efficiently performed in O (A^ log TV) time, where 
N is the length of the observed signal, using the bootstrap and 
East Eourier Transforms. 

II. Second-Order Properties and the Power 
Variance 

We first formulate theory for continuous signals, before 
detailing how approximations are made with discrete ob¬ 
servations. Consider a continuous bivariate signal {xt,yt} 
represented as a complex-valued quantity zt = Xt + iyt at 
each time point t, where i = ©—1. If the complex-valued 
signal {zt} is second-order stationary then {zt} has a finite 
constant mean, and an autocovariance sequence E{ztZf_^) that 
is finite and does not depend on t, where z^ denotes the 
complex conjugate of Zt- To fully describe the properties of 
the complex-valued signal, we also need to describe its relation 
sequence 0, given by E{ztZt-T), which is also independent 
of time for a second-order stationary signal. 

A second-order nonstationary signal does not satisfy one 
or more of the constraints of E{zt), Fj{ztz^_^) and E{ztZt-r) 
being independent of t. There are a multitude of ways 
that this can happen. We focus on the simple case where 
Var(zt) = E{\zt\‘^) varies with time, such that the signal is 






Fig. 1: The trajectories of three different particles propagated in a large two-dimensional fluid turbulence simulation. 


heteroscedastic. One simple class of such models is that of 
uniformly modulated processes Q, where zt = CtUt where ut 
is a stationary process, and at is a deterministic signal. Hence, 
a measure of nonstationarity during the time interval (0, T) is 
the time mean of the squared deviation of the variance from 
its time-mean value, which we term the power variance, given 

by ^ 

n{T) = ^[ {Vai{zt} - dt, (1) 

where the sample variance, or average power, is given by 

d'^ = ^ J Varjzt} dt. (2) 

It is clear that for a stationary signal, the variance of Zt 
is constant over time such that fl(T) = 0. However for 
nonstationary signals it would commonly be the case that 
r2(T) f 0, depending on the form of nonstationarity. 


If we now consider discrete observations of {zt}, then we 
do not observe Varjz*} directly, so instead we calculate a test 
statistic. For a regularly sampled complex-valued signal, z = 
{zo,Zi,Z 2 , .■.,Z]sf-i), we define the observed power variance 
as 

, N-l 2 

= iv ^ 

n—0 


where 


1 

N 


N-l 


n=0 


(4) 


For a discretelj' observed stationary signal, it is not typically 
the case that H(z) = 0, in contrast to H(T) in |^. This is 
because of the natural stochasticity in zt, which yields f- 
at each observed time point. It is therefore possible for a 
nonstationary signal to have a smaller observed power variance 
than a stationary signal—a feature which we account for in our 
algorithm by performing two-sided statistical tests. 


HI. Bootstrap Test Procedure 

To test whether an observed signal is stationary or not, 
we first generate a number of randomised signal replicates 
that are constructed to be stationary, where each replicate has 
the same overall variability as the observed signal. We then 
contrast the power variance of the observed signal and that of 
the replicates. If the power variance of the observed signal is 


significantly different from the typical power variance found 
in the replicates, then we can reject the null hypothesis of 
stationarity, and identify the signal as nonstationary. 

We generate the signal replicates using the bootstrap 
method | |l9) . We do this in the frequency domain by following 
the exact procedure of |^. This simple procedure takes 
any complex-valued signal and calculates its Discrete Fourier 
Transform (DFT) given by 

N-l 

(5) 

n—0 

The Discrete Fourier Transform is complex-valued and we 
have decomposed this in terms of an amplitude \Zk\ and 
corresponding phase fk at each frequency k. The idea of pO) 
is to fix the amplitudes \Z^, \ but uniformly generate randomised 
phases, fk ~ ld{— 71 , 71 ), and then perform the inverse DFT 
into the time domain to generate a randomised replicate of the 
signal z = (zq, zi, 22 , zn-i) given by 

1 ^-1 
k=0 

By definition, the frequency components of a stationary signal 
have uncorrelated phases, and hence the replicated signal z 
(which has completely random phases) is stationary, even if 
the observed signal z is not. See also | [20) in relation to this. 

Our test procedure generates many such replicated signals 
and compares the power variance with that of the observed 
signal, to see if there is a significant difference. We detail the 
exact test procedure in Algorithm 1, where FFT denotes the 
Fast Fourier Transform and I( ) denotes the indicator function. 
If the null is rejected then there is sufficient evidence to suggest 
that the observed signal is nonstationary, otherwise the null 
should not be rejected using this test procedure. Note that 
this is a two-sided test procedure—we have tested whether 
the observed power variance is significantly lower or higher 
than that found in the distribution of stationary replicates. One¬ 
sided tests can be performed by using the values of q{z) or 
r(z) in Algorithm 1, instead of p(z), to respectively test for 
significantly high or low power variance. 

By using the FFT to carry out the Fourier transforms, we 
can achieve O {NlogN) computational cost, for each bootstrap 
iteration. For example, performing a test on one complex¬ 
valued signal with N = 1000 and B = 1000 on a 2.6 






Algorithm 1 Bootstrap Power Variance Test 


INPUTS: z = {zo, Zn-i), B 
{Zk} ^ FFT(z) 
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_ - 

N 2-^n—' 
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for 6 = 1 to i? do 

(j)k U{—'K, tt), for k = 0, iV — 1 

zlW ^ iFFT(||Zt|e*}) 




N-1 

n—0 



^N—1 
N Z^n=0 




end for 

^(z) ^ 5 Ef=i Ip < ^(z)] 

p(z) ^ 2 min {g(z), r(z)} 
if p(z) < 0.05 then 
Reject null 

end if 


GHz 2012 Macbook Pro takes 0.072s. As the bootstraps are 
generated independently, the method is straightforward to run 
in parallel with larger datasets. We note that bootstrapping, 
phase scrambling, and similar procedures for generating time 
series replicates are discussed further in pT[-||2^. 

IV. Illustrative Examples 

By way of illustrative examples, we present three canon¬ 
ical cases for investigating our test procedure. All code in 
this paper is generated in Matlab® and is available online 
from http://www.ucl.ac.uk/statistics/research/spg/software, and 
all results are exactly reproducible. Also free to download is 
an algorithm for applying the test procedure to any observed 
complex-valued signal. 

The hrst example we present is an autoregressive AR(1) 
model, where the sequence {zn} is dehned by 

Xn = 0.9a;„_i -I- 0.1cr^^\ where A/’(0,1), 

yn = 0.9j/„_i -b where A/'(0,1), 

Xn + iyn 

V2 ' 

This model generates 
stationary signals. We 
now check if our test 
procedure correctly 
does not reject the 
null for realisations 
from this model. 

We set N = 1000 
and B = 1000 in 

Algorithm and 

compute p(z) for 
a signal generated 
from this model. We 
perform a Monte Carlo 
simulation and repeat 
this procedure 10,000 



p-val 

Fig. 2: Distribution of p-values cal¬ 
culated from p(z) in Algorithm 
for AR(1) process. 


times with independently generated signals from the AR(1) 
model. We then get a distribution of p-values which are 
shown in Fig. with the 5% signihcance level indicated 
with a vertical red line. It is clear that there is no tendency to 
reject the null for this process, with 5.21% of null hypotheses 
rejected, which is consistent with a type I error level of 5%. 


The second canonical example is a basic jump process, 
with {zn} defined by 


Xn = where A/'(0,1), 

Vn = (X^n\ where A/'(0,1), 

\l + {xn+ iyn)l'J^, if n < /V/2, 

Zn = { j- 

1 3 -I- (a:„ -b iyn) 1^/2, if n > /V/2. 


This model generates nonstationary signals with a signihcant 
change-point midway through the series. We check if our 
test procedure correctly rejects the null for realisations from 
this model. Again setting N = 1000 and B = 1000, we 
generate 10,000 independent instantiations of such a jump 
process, and compute p-values using Algorithm This time 
we perform a one-sided test and report values for q{'z,) in 
Algorithm to check for high power variance. The one- 
tailed test p-values are shown in Figure with the 5% 
signihcance level similarly indicated with a vertical red line. 
In this Monte Carlo study, there is a tendency to reject 
the null for this process, with an alternative of increased 
power variance: 71.8% of 
null hypotheses are re¬ 
jected at the 5% sig¬ 
nihcance level. This oc¬ 
curs because the pro¬ 
cess is nonstationary and 
has a signihcant change- 
point, which stationary 
replicates will not have. 

This yields a signihcantly 
higher observed power 
variance in most cases. 

The null is not always 
rejected however as the 
signal-to-noise ratio is set 
reasonably low. 

The hnal illustrative 
example is a cyclo-stationary process, with {z„} dehned by 
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Fig. 3: Jump process with p- 
values calculated from g(z) in 
Algorithm 
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where A/’(0,1). 


We set w = 10 and a = 1 such that the signal-to-noise ratio 
is relatively low. This model generates nonstationary signals 
because the sinusoid is ‘phase-locked’ over time. For our 
simulations we again set N = 1000 and B = 1000, and 
perform 10,000 Monte Carlo replicates. We perform a one¬ 
sided test and report values for r(z) in Algorithm to check 
for low power variance. The one-tailed test p-values are shown 
in Figure with the 5% signihcance level again indicated 
with a vertical red line. It is clear that there is a tendency to 
reject the null for this process, with an alternative of decreased 
power-variance: 82.3% of null hypotheses are rejected at the 
5% signihcance level. We observe a decreased power-variance 


































here because the fixed phase of the sinuoid implies that 
the amplitude of the signal will be less variable than the 
stationary replicates—which will not be ‘phase-locked’ and ex¬ 
hibit more variability in terms of their instantaneous variance. 


The illustrative exam¬ 
ples given above (Figures 
m are all based on sig¬ 
nal length N = 1000. 
Performance decreases as 
signal length decreases, 
however the null is still 
rejected in many more 
cases in the nonstationary 
examples than would be 
expected for a stationary 
signal, for even the short¬ 
est signals. Table |I] sum¬ 
marises these findings. 
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Fig. 4; Cyclo-stationary process 
with p-values calculated from 
r(z) in Algorithmic 


N 

AR 

Jump 

CS 

1000 

5.21% 

71.8% 

82.3% 

500 

4.81% 

57.2% 

60.6% 

200 

5.11% 

39.2% 

36.1% 

100 

5.16% 

29.0% 

24.5% 

50 

5.51% 

21.5% 

17.1% 

20 

5.94% 

14.8% 

11.7% 

10 

5.53% 

11.5% 

9.50% 


TABLE I: Percentages of null hypotheses rejected at the 5% 
level for various signal lengths N, for auto-regressive (AR), 
jump and cyclo-stationary (CS) processes. All settings except 
N are identical to those used to generate Figures 

V. Theory 

The bootstrap signal replicates, z, generated in Algorithmic 
are used to approximate a distribution of power variances, 
under the null hypothesis that the observed signal is stationary. 
In this section we derive the expectation of this distribution 
theoretically. We denote E(n(z)) as this expectation and 
analytically derive its form in terms of the Fourier amplitudes 
\Zk\ found in (|C). 

Proposition 1: For a given complex-valued signal z then 
the power variance of bootstrap replicates z generated using 
0* has expectation given by 

( N-l \ 2 N-1 'I 

(7) 

fc=0 / k=0 ) 

conditional on the observed signal z = {zq, zi, Z2, ■■■, zn-i), 
where \Zk\ is found using 0. 

This proposition is particularly useful when performing 
one-sided tests, in which one is looking for evidence that 
n(z) is either unreasonably high or low. This is because 
the expectation can be pre-computed before implementing the 
bootstrap. Then if the expectation is greater than the observed 
power variance when testing for a high power variance—or 
vice-versa for a low power variance—there is no need to 


E(f](i)) = 


N4 


perform the bootstrap as the null can already conclusively not 
be rejected at this stage. The proof of the proposition is as 
follows. 


Proof: First we directly compute from 0 and 0 that the 
expectation for the replicates takes the form 


E(fl(z)) = E 


N-l 


]V ^ " 


n=0 


N 


N-l 

El 


n—O 


= E -E 


N-l 




n=0 


(8) 


Examining the first term of 0 and using 0 we see that 


E 12 
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=7Vi 1^12^1- +VI w 




k—0 Iz/zk 


in which the Eourier amplitude coefficients \Zk\ are regarded 
as deterministic. This last expression follows as the phases are 
generated randomly meaning 


E 




{ 1, \f p = k and q = I, 

or if <7 = fc and p = I, 
0, otherwise, 


and thus the factor of two in the second term of 0 follows 
from the two possibilities by which the indices in the sum¬ 
mation can pair together (p = k and q — I, 01 q — k and 
p = 1). The first term at the right-hand side of 0 arises from 
the special case q = p = I = k. Equation can then be 
rewritten as 

9 1 ^-1 

= Ei^^n (10) 

\k=0 / fc=0 

The second term of 0 can be immediately found in terms of 
\Zk\ using the Parseval-Rayleigh relationship 


E 



N-l 

E 


n—O 


21 



N-l 

E 




2 


( 11 ) 


The proposition then follows by substituting in ( [T0| > and O 
into 0. ■ 


The full analytic distribution of 17 (z) cannot be easily 
found for a given signal z. Even if the observed signal is 
Gaussian, the distribution of power variances will in general 
not be Gaussian for finite N. Eor this reason we do not 
proceed with an analytic approach of performing the power 


variance test, and instead the bootstrap method of Section III 
is preferred. 
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Fig. 5: The real part of the velocity signal from the trajectories plotted in Fig. 1. The time length of the signal is 1000 days. The 
imaginary part of the velocity signal is similar in shape and magnitude, but the rapid oscillations seen in the real part typically 
coincide with oscillations in the imaginary part that are ninety degrees out of phase, leading to looping motions on the complex 
plane, which can be seen in Fig. 1. 



Fig. 6: Results from bootstrap power variance tests for the trajectories plotted in Fig. 1. Each plot shows the observed power 
variance (|), as well as the distribution of power variances from the bootstrap replicates (—). The theoretical expectation of this 
distribution is also given (|). The p-value of the observed power variance is reported above the figure, and values less than 0.05 
lead to rejection of the null hypothesis that the signal is stationary. 


VI. Particle Trajectories in Two-Dimensional 
Turbulence 

Quasigeostrophic (QG) turbulence p5) is a model for 
large-scale fluid motions in the ocean and atmosphere that is 
predominantly two-dimensional and is dominated by a balance 
between the pressure force and the apparent Coriolis force. 
Qualitatively, the fluid can be divided into two regimes— 
regions of swirling masses of fluid known as eddies, and 
regions in between the eddies. The positions of particles 
transported by the flow, referred to as drifters, are tracked 
over time, leading to trajectory curves. In this simulation there 
are some drifter trajectories which experience nonstationarity 
due to crossing eddy boundaries, thereby transitioning between 
eddy interiors and the ambient fluid, and others which do not. 
This section uses our test procedure to classify the trajectories 
into different types. 

As in Section m all code and data is available online at 
http://www.ucl.ac.uk/statistics/research/spg/software and all re¬ 
sults are exactly reproducible. Furthermore, the QG turbulence 
simulation can also be reproduced using the software available 
at jeffreyearly.com/numerical-models/. 


Fig. 0 shows three trajectories from the simulation. In Fig 
1 a), we see an example of a stationary trajectory, while Fig. 
1 b) shows an example of a nonstationary trajectory comprised 
of portions with multiple distinct flow characteristics, and Fig. 
[TJc) is another nonstationary trajectory dominated by a single 
eddy. Fig. shows the same examples, with the real part of the 
drifter velocities plotted, and we can again observe the same 
characteristics of stationary and nonstationary behaviour. 

Fig. i shows the results of applying the bootstrap power 
variance test of Algorithm [T (with B = 1000) to the complex¬ 
valued velocity signals. Included in the figure are the corre¬ 
sponding variances of the null distributions, along with the 
observed power variances calculated according to Q, and 
the means of the null power-variance distributions calculated 
from (j7|i. In Fig. |^a), the observed power variance is in the 
middle of the null distribution (and hence the null is not 
rejected), because this signal appears reasonably stationary; 
the p-value here is calculated as p(z) in Algorithm In Fig. 
I^b), the observed power variance is in the extreme upper 
tail of the null distribution (leading to rejection of the null), 
because this signal is clearly nonstationary: the p-value here 
is calculated as q(z) in Algorithm Finally in Fig. |^c). 































the observed power variance is in the extreme lower tail of 
the null distribution (and again the null is rejected). This is 
because this signal is nonstationary in a different way: the 
sinusoidal oscillations are ‘phase-locked’ in such a way that 
the instantaneous variance of the signal has less variability than 
stochastic stationary replicates. The p-value here is calculated 
as r(z) in Algorithm fn 


VII. Conclusions 


In this paper, we have proposed an algorithm for testing 
for nonstationarity in complex-valued signals. The algorithm 
is based on calculating the power variance, a measure of the 
instantaneous variance of power of the observed signal. The 
algorithm can learn two different types of nonstationarity, this 
leading to a two-sided test. Both tails are indeed informative: 
if the power variance is too large, then this indicates that 
the variance of the signal is too variable to be classified as 
stationary. On the other hand, if the power variance is too 
small, then this indicates that the variance of the signal is less 
than that expected under the hypothesis of stationarity. 

To perform the test, the algorithm employs the bootstrap 
method. The bootstrapped signals have the same overall vari¬ 
ance, but are stationary by construction and therefore have 
constant variability. By comparing the original signal with 
these replicates we can determine if the original signal has the 
characteristics of a stationary signal. We do this by comparing 
the power variance. Significantly high power variance indicates 
a heteroscedastic process which is nonstationary. Significantly 
low power variance indicates a very small degree of variability; 
this is indicative of a strong ‘phase-locking’ whereby various 
Fourier components interact to generate a relatively uniform 
level of signal power. 


Developing fully assumption-free methods of detecting 
signal nonstationarity is of great current interest. Our focus 
in this paper has been on coupled pairs of signals; in general 
we would study signals of arbitrary dimensionality. In such 
settings it is significantly harder to determine natural test 
statistics, and extending our methods to such settings remains 
an outstanding challenge. Finally, while the focus of this paper 
has been on naturally complex-valued signals, the same basic 
approach should apply to real-valued signals that have been 
made complex-valued though taking the analytic part |26|, 
which is a promising extension for future work. 
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